International Journal of Education and Development using Information and Communication Technology 
(IJEDICT), 2010, Vol. 6, Issue 3, pp. 124-146. 


Extracting and comparing the intricacies of metadiscourse of two written persuasive corpora

Chan Swee Heng and Helen Tan 
Universiti Putra Malaysia, Malaysia 


ABSTRACT 

Previous studies (Wu 2007; Hyland 2004; Hyland & Tse 2004; Intaraprawat & Steffensen 1995;
Crismore et al 1993; Vande Kopple 1985) have established metadiscourse (MD) as an
essential element in writing, as it allows the writer to create a dialogic space with readers. In
recent years, attempts have been made to analyse MD in text corpora with the
help of computer technology, especially when the corpus is large. In this investigation, data were
obtained electronically to illustrate the use of MD in writing samples from a
group of Malaysian undergraduates. To investigate the use of MD by these students, their
writing was benchmarked against an established standard, the open-access BAWE corpus, which is available
online. The MD features were analysed with the concordancing software monoConc Pro 2.2.
The paper demonstrates how the software manages the data to reveal patterns
of use among writers of the two corpora. It concludes with initial insights from
the comparison that show the nature and manner of MD use in standard proficient writing
(an extract from the BAWE corpus) and in evolving student writing at the tertiary level, which have
implications for writing improvement in educational institutions.

Keywords: student writing; concordancing software; MD; text analysis; BAWE corpus


WHAT IS METADISCOURSE? 

Metadiscourse (MD) has been defined in a number of ways by different researchers. Williams
(2007, p. 65) defines it as “the language that writers use to refer not to the substance of their
ideas but to themselves, their readers, or their writings”. Similarly, Vande Kopple (1985, p.83),
who classifies MD into a number of features, states that MD is the:

linguistic material that does not add propositional meaning to the content but signals the
presence of the writer.

This notion is also held by Crismore et al (1993), who add that MD helps both readers
and listeners to “organize, interpret and evaluate the information given” (p.40). This functional
interpretation of MD is supported by Hyland and Tse (2004), who elaborate that it
is a useful linguistic resource that writers can use to communicate to their readers their stance
and attitude towards the given proposition, thus emphasising the interactive perspective.

The importance of MD in writing is undisputed, and over the past decades the study of MD has
garnered much attention from researchers of second language (L2) writing. This is evidenced
by the number of studies, which range from classification to cross-cultural studies of MD.
Researchers such as Vande Kopple (1985), Crismore et al (1993) and Hyland (2005) have 
classified MD into different functional categories to explain the workings of MD. Vande Kopple 
(1985) categorised MD into two main domains - textual and interpersonal. The ‘textual domain’ 
helps writers link their propositions in a cohesive manner and the ‘interpersonal’ provides writers 
the avenue to convey their feelings towards the given propositions. The textual MD is exemplified 
through the use of ‘text connectives’ and ‘code glosses’ while the ‘interpersonal MD’ is realised
through the use of ‘illocutionary markers’, ‘validity markers’, ‘narrators’, ‘attitude markers’ and
‘commentary’. Based on Vande Kopple’s (1985) categorisation, Crismore et al (1993)
modified, collapsed and created new categories of MD. Although they retained the terminology of
the two main domains of MD, they further sub-divided ‘textual MD’ into ‘textual markers’ and
‘interpretative markers’. Under ‘textual markers’, they added ‘logical connectives’, ‘sequencers’,
‘reminders’ and ‘topicalisers’. They then removed temporal connectives and narrators and
created code glosses, illocution markers and announcements as interpretative markers. Other
than these frameworks, Hyland (2005) promotes the interpersonal model of MD. His model is not
only an update on the taxonomies used by Vande Kopple (1985) and Crismore et al (1993) but
also gives greater comprehensibility and distinction to the varieties of MD features. As a result,
his framework is adopted in this study, bearing in mind that it is still open to further
refinement. Hyland (2005), in the same manner as Vande Kopple (1985) and Crismore et al
(1993), divides MD into two main domains. However, he identifies them as ‘interactive’ and
‘interactional’ MD. He explains that the function of ‘interactive MD’ is to help guide readers
through the text while that of ‘interactional MD’ is to involve the reader in the argument.
Interaction with the reader is firmly anchored in his framework, and he further details the
categories of interactive and interactional MD, providing comprehensive examples for each
sub-category. The sub-categories of ‘interactive’ MD are ‘transitions’, ‘frame
markers’, ‘evidentials’, ‘endophoric markers’ and ‘code glosses’. The ‘interactional’ MD
categories are ‘hedges’, ‘boosters’, ‘engagement markers’, ‘attitude markers’ and
‘self-mentions’ (the framework is presented in Table 1).

With the advent of information and communication technology (ICT), the study of MD took on a new
dimension. ICT made possible the investigation of large corpora through the use of concordancing
software. A case in point is the comparative study by Hyland (1999), which
compared the use of MD in textbooks and research articles. The results showed that research
articles have more interpersonal MD. Another corpus study of MD is Hyland’s (2004)
investigation of the use of MD in postgraduate writing. The study revealed that doctoral theses
have more interactive MD than master’s theses. Interestingly, ‘evidentials’ appeared
four times more often in doctoral theses, indicating the value placed on the greater use of
citation as central to the argumentative or persuasive force of the text. A comparison of MD use
between good and poor ESL undergraduate writers is found in Intaraprawat and Steffensen’s
(1995) work, which found that good essays have more MD features than poor essays.

Apart from the various studies that explore the use of different categories of MD, others have
explored specific MD features. Wu (2007) concentrated on the use of engagement
resources in high- and low-rated undergraduate geography essays, while Hyland (2001a) studied
the importance of audience engagement in academic arguments. Harwood (2005) concentrated
on the use of self-mention, especially the use of inclusive and exclusive pronouns, and Hyland
(2001b) focused on the use of self-citation and exclusive pronouns.

The investigation of cross-cultural perspectives added another dimension to the studies on MD,
as seen in the work by Crismore et al (1993), who compared the use of MD in
argumentative essays written by American and Finnish students. Both Dafouz-Milne (2008) and
Aertselaer (2008) compared the use of MD between Spanish and English writers. While
Dafouz-Milne (2008) focused on the construction and attainment of persuasion in newspaper writing
in The Times (British) and El Pais (Spanish), Aertselaer (2008) concentrated on the use of MD in
the argumentative texts of the English-Spanish Contrastive Corpus (ESCC). Culture is an important
variable that impinges on the understanding of written language use, and cultural differences are
reflected in different thought processes in dealing with reader-writer interaction, which can
be captured in the use of MD.



PURPOSE OF THE STUDY 

Previous studies on MD have one striking similarity in their choice of writing samples: the writing
was taken from experienced writers. This study intends to build on the existing knowledge of MD by
using a concordancing tool to investigate the use of MD by a
group of first-year Malaysian undergraduates (MU) enrolled in a writing skills course. Through the
application of the computer software, the undergraduate writing is compared, in terms of MD use,
with that of a group of acclaimed proficient writers. The writings of these acclaimed proficient writers
were drawn from an open-access electronic text corpus compiled by academics in
Great Britain and known as the BAWE (British Academic Written English) corpus. One of the
intentions behind the compilation of the corpus was to have an international benchmark for
student academic writing and also to provide a corpus for research on writing. The lack of
borders in today’s global world has pushed many countries, especially those where English is used as an L2,
to seek indicative markers that can be used to gauge local performance against external standards
that are seen as desirable. Every writing teacher strives for improvement in their students’
work, and a corpus of good writing can serve as a yardstick for performance to which L2 learners
can aspire. To understand how exemplary writing operates, specific elements can be
isolated for meaningful exploration so that the data obtained give learning pointers for the
writing classroom. Essentially, the study examines the
specificities in the use of MD in local student writing with a view to gauging how closely it approximates an
international standard.


Research Questions 

The questions that guided the research are: 

1. What are the frequency and forms of MD use in the persuasive writing of L2 Malaysian 
undergraduate (MU) writers compared to the BAWE writers? 

2. Are there any differences or similarities in the MU writers’ use of MD when compared with 
the BAWE writers? 


METHODOLOGY 

Sampling was both purposive and stratified. For the MU corpus, writing samples
were drawn from first-year university students enrolled in a writing course, General Writing Skills.
They had obtained grades A1 and A2 for English in the Sijil Pelajaran Malaysia (SPM), which is
equivalent to the British O-Level, on leaving secondary school. They represented
undergraduate writers with high proficiency in English, and the sample amounted to 294 texts
with a total word count of 145,425 words. A1 in SPM is a Malaysian national indicator for a large
portion of school-leavers who have reached the prescribed standards. Our interest in this study
focuses on students who have exited the national assessment system and have been given
recognition of a particular level of competence. It is a general criticism that students
entering university are not able to write well in spite of attaining an A1 proficiency. This has
raised the researchers’ concern about the discrepancy between Malaysian A1
students and acclaimed good writers. The BAWE corpus, as the benchmark, was identified
through a survey of written text corpora available on the Internet. The compilers of the BAWE corpus
claim that it is the standard for student academic writing, and it was constructed as a
research reference for researchers to use. From the outset, it must be made known that this
study is not a comparison of Malaysian English versus British English. Contact was established 
with the sites and after weighing the choices in terms of accessibility, and verification about the
corpus through personal communication, including the writing genres involved, the researchers
decided on the use of the BAWE corpus. In the selection of the BAWE corpus, reference was
made to the Oxford Text Archive (resource number 2539), which is an online resource. From
the BAWE spreadsheet, a total of 2,761 texts (from Humanities, Social Studies and
Science) was found. Text types include essays, research reports, literature surveys, critiques,
methodology recounts, empathy writing, problem questions, explanations and design specifications.
The writers were in Years 1, 2 and 3 of their undergraduate studies or in master’s programmes. To
ensure the relevant use of the BAWE essays, the researchers established direct contact with the
Director of the BAWE corpus project to identify the text type and level of study of the writers.
With the help of the BAWE spreadsheet, the choice was narrowed to the essay type
(argumentative texts) and the Year 1 category of undergraduate writing. Since the essays in the
BAWE corpus were obtained from students in varied disciplines, the titles of their essays
also varied (Nesi et al 2004, p.441). Therefore, the comparison of MD use between the BAWE
corpus and the MU corpus was based not on similarity of essay titles but on similarity
of text rhetoric, namely that of the argumentative type. In total, 400 texts were selected, with
a total of 808,642 running words.


The Writing Task 

To elicit the writing evidence, students were asked to write an argumentative essay on the
topic of smoking. A persuasive task was chosen as it is deemed to be the rhetorical form that is
most likely to exhibit the varieties of MD (see Appendix for the essay prompt). The writing of the
final essay was timed, and the duration given for completion was 1 hour 15 minutes.
Prior to the administration of the writing task to the participants, a pilot study was conducted on a
group of 126 undergraduates. Based on the feedback, the writing task was revised to
enable all categories of MD to be captured in the participants’ writing (including
evidentials and endophoric markers, which were found to be lacking in the pilot analysis).

The first draft of the essay was written in the second week of the semester. After that, 12 hours of input on
MD was given to the undergraduate students. The instructional input focused on
samples of text that had MD features. These were taken from varied sources such as textbooks,
articles on the Internet and newspapers. Probing questions followed the introduction of
each text sample to direct the subjects’ attention to the use of MD. One of the exercises in the
instructional input involved the drafting and redrafting of written work. After acquiring
knowledge of the use of MD, the subjects were required to improve a given text using
appropriate MD. This mirrored the redrafting of the essay they had written earlier, which
was to be done at the end of the intervention period. Thus, the students’ first drafts were
returned and the students were told to improve on their writing in their second drafts. The
second drafts were then collected and analysed to obtain the data for the study.

Instrument 

To analyse the MD used, Hyland’s (2005) Interpersonal Model of MD provided the initial
guidelines. Hyland’s framework was chosen over others, such as those of Crismore et al (1993)
and Vande Kopple (1985), after a detailed comparison was carried out. Hyland’s (2005)
framework is seen as the most comprehensive. This framework, however, is seen as evolving and
open in the sense that studies of MD can still contribute to the building up of the MD
categories. As such, MD features that do not fit the model will be
noted separately as possible additions to it. The details of Hyland’s (2005) model are as
follows.



Table 1: An interpersonal model of MD (Hyland 2005, p. 49) 


Category            Function                                               Example

Interactive         Help to guide reader through the text
Transitions         express semantic relation between main clauses         in addition/but/thus/and
Frame markers       refer to discourse acts, sequences, or text stages     finally/to conclude/my purpose here is to
Endophoric markers  refer to information in other parts of the text        noted above/see Fig/in section 2
Evidentials         refer to source of information from other texts        according to X/(Y 1990)/Z states
Code glosses        help readers grasp functions of ideational material    namely/e.g./such as/in other words

Interactional       Involve the reader in the argument
Hedges              withhold writer’s full commitment to proposition       might/perhaps/possible/about
Boosters            emphasise force or writer’s certainty in proposition   in fact/definitely/it is clear that
Attitude markers    express writer’s attitude to proposition               unfortunately/I agree/surprisingly
Engagement markers  explicitly refer to or build relationship with reader  consider/note that/you can see that
Self-mentions       explicit reference to author(s)                        I/we/my/our


The instrument used to analyse the texts was monoConc Pro version 2.2, a
concordancing program developed by Barlow (2003). This electronic tool was adopted to
facilitate text analysis, in particular the analysis of MD use.
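
To illustrate what a concordancer displays, the following is a minimal keyword-in-context (KWIC) sketch in Python. It is not the implementation of monoConc Pro: the function name `kwic`, the `width` parameter and the second sentence of the sample text are assumptions made for illustration only.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword."""
    lines = []
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# The first sentence uses the listing example from this paper;
# the second sentence is invented to show 'and' linking two clauses.
sample = (
    "Smoking causes heart attack, strokes and cancer. "
    "Smoking is harmful and it should be banned in public places."
)

for line in kwic(sample, "and"):
    print(line)
```

Each printed line centres one occurrence of the search word, so that the analyst can judge from the context whether that occurrence functions as MD.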


IDENTIFICATION OF MD 

To run the programme, there are some preliminary procedures. First, only words or expressions 
that have metadiscoursal values are classified as MD. For example, transition ‘and’ is counted as 
an MD token only when it is used to link two clauses. If it is used as a linker in listing such as in 
“heart attack, strokes and cancer”, it is discounted as an MD feature. 

In the MU corpus, some metadiscourse features were found to be lifted directly from the essay
prompt. This was another constraint in the tagging of MD. Such features were likewise not counted as tokens of
MD use by the writer.

For words having 200 hits/matches or more (‘and’ in the BAWE corpus, for example, has 23,707 hits), the list was
randomised and the first 200 concordance lines were analysed for MD use. The proportion of MD
uses identified in this sample was then extrapolated to the total number of hits for the word.
The resulting count was normed to occurrences per 10,000 words so that MD use could be
compared between two corpora of unequal size (MU corpus: 145,425 words and BAWE corpus:
808,642 words).
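
The extrapolation and norming steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure built into the concordancing software: the function names are hypothetical, and the sample count of 60 MD uses in 200 lines for ‘and’ is an assumed figure, chosen only because it is consistent with the 7,112 hits reported for ‘and’ in Table 5.

```python
def extrapolate_hits(md_in_sample, sample_size, total_hits):
    """Scale the proportion of MD uses found in a sample up to all hits."""
    return round(md_in_sample / sample_size * total_hits)


def per_10k(hits, corpus_words):
    """Norm a raw count to occurrences per 10,000 words."""
    return hits / corpus_words * 10_000


BAWE_WORDS = 808_642  # corpus sizes as reported in this paper
MU_WORDS = 145_425

# 'and' has 23,707 hits in the BAWE corpus; 200 lines were sampled.
# ASSUMPTION: 60 MD uses in the sample is a hypothetical figure.
and_md_hits = extrapolate_hits(60, 200, 23_707)
print(and_md_hits)  # 7112

# Norming the interactive MD totals of each corpus.
print(round(per_10k(30_646, BAWE_WORDS), 1))  # 379.0
print(round(per_10k(4_644, MU_WORDS), 1))     # 319.3
```

Norming to a common base of 10,000 words is what allows the smaller MU corpus to be compared directly with the much larger BAWE corpus.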




RESULTS AND DISCUSSION 

To begin with, a frequency count was made of the use of MD. The BAWE corpus contains
808,642 words while the MU corpus contains 145,425 words.
The MU corpus is much smaller as it consists of L2 written texts with an average length of
500 words, while the text length in the BAWE corpus is between 1,000 and 5,000 words (Nesi et al
2004, p.441). The frequency count is displayed below according to the two major categories of
MD use.


Table 2: Frequency of use of interactive and interactional MD 


                 BAWE Corpus (Total words: 808,642)      MU Corpus (Total words: 145,425)
MD Category      Total Hits     Occurrence per           Total Hits     Occurrence per
                                10,000 words                            10,000 words
Interactive      30,646         379.0                    4,644          319.3
Interactional    19,571         242.0                    5,151          354.2


The BAWE corpus has a higher frequency of use of interactive MD (379.0
occurrences per 10,000 words compared to 242.0 occurrences per 10,000 words for interactional
MD). This result is similar to those of other MD studies (Hyland 2004; Hyland & Tse 2004;
Intaraprawat & Steffensen 1995), in which interactive MD is more
frequent than interactional MD. Interactive MD encompasses linguistic resources
that writers use to organize and structure their propositions so that the text is more
coherent to readers. Transitions, frame markers, endophoric markers, evidentials
and code glosses are examples of interactive MD.

In direct contrast to the findings on the BAWE corpus, the MU corpus exhibited a higher
frequency of interactional MD than of interactive MD (354.2 occurrences per
10,000 words compared to 319.3 occurrences per 10,000 words for interactive MD). Interactional
MD comprises linguistic signals that attempt primarily to connect with the audience, such as
hedges, boosters, engagement markers, attitude markers and self-mention.

It could be said that interactive signals engage the reader on a level that relates more to formal
grammar, while interactional signals relate more to the socio-affective level, where audience
engagement is prioritised in discourse. The MU writers seem to reveal a
lack of sensitivity and ability in achieving coherence in writing from the interactive perspective.
This is not surprising as the MU corpus consists of texts written by L2 undergraduate writers.
Although these L2 undergraduate writers are deemed to have high general English proficiency
(obtaining A1 and A2 in English in the secondary school leaving examination), they may not have
attained a writing proficiency level that can mobilise all the linguistic resources of the target
language to craft a coherent piece of academic writing in English in the way the BAWE writers do.

While interactional MD provides writers with the opportunity to engage their readers through linguistic
resources that ‘give life’ to the piece of writing, the MU writers have yet to meet the demands of
academic writing conventions, which appear to call for a sensitive, ‘balanced’ use of both types of
MD. The MU writers may have built solidarity with their readers but are still strategising to align
their writings according to the expectations of experienced readers linguistically on a more formal 
level. 

The next aspect was to examine the frequency of use according to the specific sub categories of 
interactive and interactional use. The table below shows the analysis as generated by electronic 
means for interactive MD. 

Table 3: Number of occurrences of the categories of interactive MD in the BAWE and MU corpus 


                      BAWE Corpus (Total words: 808,642)            MU Corpus (Total words: 145,425)
MD Category           Total Hits   Occurrence per   % of Total      Total Hits   Occurrence per   % of Total
                                   10,000 words                                  10,000 words
1. Interactive
Transitions           19,564       241.9            63.8            3,579        246.1            77.1
Frame Markers         939          11.6             3.1             363          25.0             7.8
Endophoric Markers    1,257        15.5             4.1             148          10.2             3.2
Evidentials           5,415        67.0             17.7            117          8.0              2.5
Code Glosses          3,471        42.9             11.3            437          30.0             9.4
Total                 30,646       379.0            100.0           4,644        319.3            100.0


Among the five categories of interactive MD in the BAWE and MU corpora, transitions have the
highest frequency of use, accounting for more than half of all interactive MD
(BAWE corpus: 241.9 occurrences per 10,000 words; MU corpus: 246.1 occurrences per 10,000
words). Between the two corpora, the frequency of use of transitions in the
MU corpus is marginally higher. The next highest frequency of use in the BAWE corpus is that of
evidentials (67.0 occurrences per 10,000 words), while in the MU corpus it is that of code
glosses (30.0 occurrences per 10,000 words). In comparison, the use of evidentials in the MU
corpus is the lowest, with only 8.0 occurrences per 10,000 words. With evidentials in
the BAWE corpus occurring about eight times as often as in the MU corpus, the BAWE writers
seem to exhibit a greater awareness of the need to establish the writer’s credibility. The use of evidentials
contributes to what Hyland (2005, p.67) calls “the perceived credibility that readers grant to writers”.

In the BAWE corpus, the third highest frequency of use is that of code glosses (42.9
occurrences per 10,000 words), which is also higher than that in the MU corpus
(30.0 occurrences per 10,000 words). The other two categories, frame markers and
endophoric markers, have very low frequencies of use. Each of these two categories accounts for less
than 10% of the total interactive MD: frame markers account for only 3.1%
(11.6 occurrences per 10,000 words) and endophoric markers for only
4.1% (15.5 occurrences per 10,000 words). In the MU corpus, the use
of frame markers is the third highest (25.0 occurrences per 10,000 words), more than
twice the rate recorded in the BAWE corpus, while the use of endophoric markers accounts
for only 3.2% of the overall use of interactive MD (10.2 occurrences per 10,000 words).
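
The ‘% of Total’ figures discussed above can be reproduced from the total hits in Table 3. The following is a minimal sketch in Python; the function name `shares` and the dictionary layout are assumptions made for illustration, and only the hit counts come from the table.

```python
def shares(hits_by_category):
    """Return each category's percentage of the summed hits, to one decimal."""
    total = sum(hits_by_category.values())
    return {
        name: round(hits / total * 100, 1)
        for name, hits in hits_by_category.items()
    }


# Total hits for interactive MD as reported in Table 3.
bawe_interactive = {
    "Transitions": 19_564,
    "Frame markers": 939,
    "Endophoric markers": 1_257,
    "Evidentials": 5_415,
    "Code glosses": 3_471,
}
mu_interactive = {
    "Transitions": 3_579,
    "Frame markers": 363,
    "Endophoric markers": 148,
    "Evidentials": 117,
    "Code glosses": 437,
}

print(shares(bawe_interactive))
print(shares(mu_interactive))
```

Because the shares are computed within each corpus, they show how each group of writers distributes its interactive MD regardless of corpus size.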



Table 4: Number of occurrences of the categories of interactional MD in the BAWE and MU corpus


                      BAWE Corpus (Total words: 808,642)            MU Corpus (Total words: 145,425)
MD Category           Total Hits   Occurrence per   % of Total      Total Hits   Occurrence per   % of Total
                                   10,000 words                                  10,000 words
2. Interactional
Hedges                9,331        115.4            47.7            819          56.3             15.9
Boosters              3,966        49.0             20.3            966          66.4             18.8
Engagement Markers    3,704        45.8             18.9            2,918        200.7            56.6
Attitude Markers      1,521        18.8             7.8             166          11.4             3.2
Self Mention          1,049        13.0             5.4             282          19.4             5.5
Total                 19,571       242.0            100.0           5,151        354.2            100.0


Table 4 displays the frequencies observed in the use of interactional MD. Of the five categories,
hedges have the highest number of occurrences in the BAWE corpus (115.4 occurrences per 10,000
words), followed by boosters (49.0 occurrences per 10,000 words) and engagement markers
(45.8 occurrences per 10,000 words). In contrast, the MU corpus exhibited an extremely high
occurrence of engagement markers when compared with the other interactional categories. They
account for 200.7 occurrences per 10,000 words, while boosters, the next most
frequent category, account for only 66.4 occurrences per 10,000 words and hedges for
56.3 occurrences per 10,000 words.

The functions of hedges and boosters are, in a way, diametrically opposed. If the function of hedges is
to tone down assertions, the function of boosters is to increase the force of the assertion. In
academic discourse, a careful balance in the use of hedges and boosters is important as
it reflects the writers’ ability to balance a show of confidence with caution. Essentially, the
balance reveals the writers’ readiness to accept alternative views while at the same time
showing confidence in their own propositions.

In the MU corpus, the dominant use of engagement markers, particularly the inclusive
pronoun ‘we’, indicates the writers’ sensitivity to the need to include readers in their arguments. The use of
such engagement markers at strategic points in the text enhances writer-reader solidarity,
helping readers to accept the argument. However, the high use of boosters compared with
that of hedges seems to indicate the writers’ over-confidence in their arguments, to the
exclusion of modesty when presenting their viewpoint. Perhaps in this area, undergraduate
writers need to be taught to write with greater caution by using more hedges than boosters. As
claimed by Williams (2007), confident writers use more hedges than boosters because they do
not want to appear too assertive. In short, a persuasive piece of writing with an adequately
balanced use of hedges and boosters would help readers to accept the argument more readily.
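
The balance between hedges and boosters can be made concrete as a simple ratio of the total hits in Table 4. This ratio is not a measure reported in the study; it is offered only as an illustrative sketch, and the function name is hypothetical.

```python
def hedge_to_booster_ratio(hedge_hits, booster_hits):
    """Return hedges per booster; values above 1 mean hedges predominate."""
    return hedge_hits / booster_hits


# Total hits as reported in Table 4.
bawe_ratio = hedge_to_booster_ratio(9_331, 3_966)
mu_ratio = hedge_to_booster_ratio(819, 966)

print(round(bawe_ratio, 2))  # 2.35
print(round(mu_ratio, 2))    # 0.85
```

On this reading, the BAWE writers use more than two hedges for every booster, whereas the MU writers use fewer hedges than boosters.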

The lowest use of the interactional MD in the BAWE corpus is self mention with only 13.0 
occurrences per 10,000 words, while attitude markers account for 18.8 occurrences per 10,000 
words. Similarly, in the MU corpus, both the attitude markers and the self mention showed the 
lowest frequency of use. However, in the MU corpus, the lowest frequency of use was seen in 
attitude markers (11.4 occurrences per 10,000 words) followed by self mention (19.4 occurrences
per 10,000 words). Although their use is significantly lower compared to other categories of 
interactional MD, it does indicate that the writers have the repertoire of MD skills. 


FORMS OF MD 

Hyland (2005) has identified ten different categories of MD, with each category realised through a
variety of forms. As one of the universal properties of human language is creativity (Fromkin et al
2007), it is to be expected that writers have a wide mental lexicon with which to express their
thoughts. In other words, each category of MD can be realised linguistically through a variety of
forms. It is because of this very characteristic of human language that the analysis of any MD feature
needs to be done in context, as any linguistic realisation can be interpreted as having either
propositional or metadiscoursal meaning. Below is a discussion of the linguistic
expressions of the different categories of MD used by the writers of both the BAWE and MU
corpora, obtained from the concordance display produced by the electronic programme.


Transitions 

In the use of transitions, the five most common linguistic realisations in the BAWE and MU corpora
are co-ordinating and sub-ordinating conjunctions. In the BAWE corpus, the
co-ordinating conjunctions are realised through the use of ‘and’, ‘but’ and ‘also’. In the MU corpus,
the co-ordinating conjunctions are realised mainly through the use of ‘also’, ‘and’, ‘but’ and ‘so’.
As for sub-ordinating conjunctions, the BAWE writers preferred ‘however’ and ‘because’
while the MU writers tended to use ‘because’.


Table 5: The first five preferred forms of transitions in BAWE corpus and MU corpus


      BAWE Corpus                                          MU Corpus
No    Forms of       Total Hits   Occurrence per           Forms of       Total Hits   Occurrence per
      Transitions                 10,000 words             Transitions                 10,000 words
1     And            7,112        87.9                     Also           707          48.6
2     But            2,108        26.1                     Because        665          45.7
3     Also           1,760        21.8                     And            453          31.2
4     However        1,638        20.3                     But            406          27.9
5     Because        917          11.3                     So             396          27.2


Overall, the BAWE corpus showed a greater variety of linguistic expressions used as transitions. Of
the 47 forms of transitions analysed, the BAWE corpus recorded 46
while the MU corpus recorded only 31. This leads us to
conclude that although the MU writers attempted to use transitions in their writing, the variety
of forms they used is more restricted than that of the BAWE writers.
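
Counting how many candidate forms are attested in a corpus can be sketched as follows. This is a minimal illustration in Python, not the procedure of the concordancing software: the function name, the short candidate list and the sample sentence are all invented for the example. A match shows only that a form is present; each hit would still need to be checked in context for metadiscoursal use.

```python
import re


def attested_forms(text, candidate_forms):
    """Return the candidate forms that occur at least once as whole words."""
    found = set()
    for form in candidate_forms:
        pattern = r"\b" + re.escape(form) + r"\b"
        if re.search(pattern, text, re.IGNORECASE):
            found.add(form)
    return found


# Hypothetical candidate list and sample text, for illustration only.
candidates = ["and", "but", "also", "however", "because", "so", "therefore"]
toy_text = "Smoking is harmful, but many people also smoke because of stress."

found = attested_forms(toy_text, candidates)
print(f"{len(found)} of {len(candidates)} forms attested: {sorted(found)}")
```

Whole-word matching matters here: without the word boundaries, ‘so’ would be wrongly counted inside ‘also’.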

Frame Markers 

In both the BAWE and MU corpus (Table 6), the five most preferred frame markers are mostly 
those that signal the sequence of the text structure. For the BAWE corpus, the five most preferred 
forms (in order of the most to least preferred forms) are ‘firstly’, ‘in conclusion’, ‘then’, ‘finally’ and 
‘first’. In the MU corpus, the preferred forms are ‘first’, ‘in conclusion’, ‘finally’, ‘last’ and ‘firstly’. To
mark the beginning of a sequence of ideas, the BAWE writers prefer the form ‘firstly’ as compared 
with the writers of the MU corpus who preferred the form ‘first’. 


Table 6: The first five preferred forms of frame markers in BAWE corpus and MU corpus


      BAWE Corpus                                            MU Corpus
No    Forms of Frame   Total Hits   Occurrence per           Forms of Frame   Total Hits   Occurrence per
      Markers                       10,000 words             Markers                       10,000 words
1     Firstly          124          1.533435                 First            72           5.0
2     In conclusion    96           1.187176                 In conclusion    60           4.1
3     Then             95           1.174809                 Finally          44           3.0
4     Finally          80           0.989313                 Last             27           1.9
5     First            74           0.915114                 Firstly          26           1.8


The function of ‘last’ can be expressed in a variety of ways, and the concordance lines in the
MU corpus reveal that the form ‘last but not least’ occurred no fewer than 20 times. This form is
not present in the BAWE corpus, but it seems to be a preferred frame marker among the MU
writers. The expression, however, is generally considered a cliché and inappropriate for
academic writing. The MU writers will have to ‘unlearn’ it and become more discerning.

As with transitions, the BAWE corpus also reveals a wider variety of frame markers than the MU
corpus. While the MU corpus exhibited 20 forms, the BAWE corpus has a total of 43. The BAWE
writers were able to use sequencing forms such as ‘firstly’, ‘to begin’, ‘in this chapter’, ‘in this
part’ and ‘in this section’. One possible reason that the last three forms are absent from the MU
corpus is that the MU writing consisted only of essays, so the writers may not have needed these
frame markers. However, the following expressions were also markedly absent from the MU
corpus: ‘to conclude’, ‘thus far’, ‘to summarise’, ‘in sum’, ‘in summary’, ‘to sum up’, ‘at this
point’, ‘on the whole’, ‘at this stage’ and ‘for the moment’. In addition, frame markers that
announce goals, such as ‘objective’, ‘intend to’, ‘aim’, ‘purpose’ and ‘wish to’, and those that
denote a shift in topic, such as ‘back to’, ‘with regard to’, ‘move on’, ‘return to’ and ‘turn to’,
were not found in the MU corpus. We may conclude that the BAWE writers had a wider and richer
repertoire of frame markers than the MU writers.
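
The comparison of variety above amounts to checking, for a fixed checklist of forms, which forms are attested at least once in each corpus. The following Python sketch illustrates the idea under stated assumptions: the two short texts are invented for illustration and are not drawn from either corpus, the checklist is a small subset of the frame markers named above, and the function names are hypothetical.

```python
import re
from collections import Counter


def count_forms(text: str, forms: list[str]) -> Counter:
    """Count whole-word, case-insensitive matches of each form in text."""
    counts: Counter = Counter()
    for form in forms:
        pattern = r"\b" + re.escape(form) + r"\b"
        counts[form] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


def attested(counts: Counter) -> set[str]:
    """Forms that occur at least once."""
    return {form for form, n in counts.items() if n > 0}


FORMS = ["first", "firstly", "last", "in conclusion",
         "to conclude", "in this section", "turn to"]

# Invented sample texts, for illustration only.
text_a = ("Firstly, smoking harms health. In this section we turn to "
          "advertising. To conclude, it should be banned.")
text_b = ("First, smoking harms health. Last but not least, it is costly. "
          "In conclusion, do not smoke.")

counts_a = count_forms(text_a, FORMS)
counts_b = count_forms(text_b, FORMS)
forms_a, forms_b = attested(counts_a), attested(counts_b)

print("Variety in text A:", len(forms_a))
print("Variety in text B:", len(forms_b))
print("Only in text A:", sorted(forms_a - forms_b))
```

The word-boundary anchors keep ‘first’ from being counted inside ‘firstly’, which matters here because the two forms are reported separately.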


Endophoric Markers 

Endophoric markers are a form of MD that refers readers to information in other parts that are
within or outside the text. This intratextual feature is used to support the argument, with
the purpose of convincing readers of its validity. In the BAWE corpus, the following linguistic
expressions were found: ‘p.X’, ‘X above’, ‘page X’, ‘X earlier’, ‘in section X’/‘the X section’,
‘Table X’, ‘X below’, ‘X later’, ‘in chapter X’/‘the X chapter’, ‘X before’, ‘in part X’/‘the X part’,
‘see X’ and ‘Fig. X’ (see Table 7 below). In the MU corpus, however, only five different
endophoric markers were used: ‘Table X’, ‘X above’, ‘X earlier’, ‘X before’ and ‘X below’.
One possible reason for the limited linguistic choices could be the length of the MU
essays (500 to 600 words). Below are the linguistic realisations as revealed by the
concordance lines.





Table 7: The first five preferred forms of endophoric markers in BAWE corpus and MU corpus

     BAWE Corpus                                                   MU Corpus
No.  Forms of Endophoric Markers     Total Hits  Per 10,000 words  Forms of Endophoric Markers     Total Hits  Per 10,000 words
1    p.X                             687         8.5               Table X                         114         7.8
2    X above                         118         1.5               X above                         20          1.4
3    page X                          94          1.2               X earlier                       6           0.4
4    X earlier                       84          1.0               X before                        4           0.3
5    (in) section X / the X section  57          0.7               X below                         4           0.3


The use of the form p. X/p X in the BAWE corpus could be closely linked to the use of citation as 
a persuasive strategy in the crafting of academic writing. Reference to the page number 
complements the citation process and it reinforces the validity and reliability of the writers’ 
arguments. 
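
Placeholder forms such as ‘p.X’ or ‘Table X’ have to be turned into search patterns before a concordancer can count them. How this was done in the software is not described here; the Python sketch below shows one possible approach using regular expressions, on the assumption that X is a number. The pattern set, the function name and the sample sentence are all illustrative and hypothetical.

```python
import re

# Assumed patterns: X is taken to be one or more digits.
PATTERNS = {
    "p.X": re.compile(r"\bp\.\s?\d+"),
    "page X": re.compile(r"\bpage\s+\d+", re.IGNORECASE),
    "Table X": re.compile(r"\bTable\s+\d+"),
    "Fig. X": re.compile(r"\bFig\.\s?\d+"),
}


def count_endophoric(text: str) -> dict[str, int]:
    """Return the number of matches for each placeholder form."""
    return {label: len(p.findall(text)) for label, p in PATTERNS.items()}


# Invented sample sentence, for illustration only.
sample = ("As Hyland (2005, p.53) notes, see Table 7 and Fig. 2; "
          "details are on page 12 (cf. p. 14).")

result = count_endophoric(sample)
for label, n in result.items():
    print(f"{label:8s} {n}")
```

Forms such as ‘X above’ or ‘X earlier’ are harder to capture this way, since X may be any noun phrase; hits for those forms would still need to be checked by reading the concordance lines.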


Evidentials 


Table 8: The first eight forms of evidentials in BAWE corpus and MU corpus

     BAWE Corpus                                           MU Corpus
No.  Forms of Evidentials    Total Hits  Per 10,000 words  Forms of Evidentials    Total Hits  Per 10,000 words
1    name/date               4926        60.9              According to X          96          6.6
2    cited in                190         2.3               X states/state that     13          0.9
3    According to X          131         1.6               Said                    7           0.5
4    X states/state that     79          1.0               quoted in/as quoted     1           0.1
5    quoted in/as quoted     48          0.6               name/date               0           0.0
6    Said                    35          0.4               (to) cite X             0           0.0
7    (to) cite X             3           0.0               (to) quote X            0           0.0
8    (to) quote X            3           0.0               cited in                0           0.0


As seen in Table 8, evidentials can also be realised through various forms. It is clear that the
BAWE corpus utilizes a greater variety of evidentials than the MU corpus. At the top of the list is
the common citation convention in which the name of the author is cited, followed by the year of
publication and the page number. The forms of evidentials that are least used in the BAWE
corpus are ‘(to) cite X’ and ‘(to) quote X’.

In the MU corpus, the form ‘as quoted’ is the least preferred form, with only one instance in the
whole corpus. Four other forms, namely name/date, ‘(to) cite X’, ‘(to) quote X’ and ‘cited in’,
were not present in the MU corpus at all.


Code Glosses 


Table 9: The first five preferred forms of code glosses in BAWE corpus and MU corpus

     BAWE Corpus                                           MU Corpus
No.  Forms of Code Glosses   Total Hits  Per 10,000 words  Forms of Code Glosses   Total Hits  Per 10,000 words
1    such as                 976         12.1              such as                 207         14.2
2    ( )                     454         5.6               for example             95          6.5
3    or X                    366         4.5               in fact                 21          1.4
4    - (dash)                338         4.2               or X                    21          1.4
5    Indeed                  231         2.9               ( )                     17          1.2


Based on the figures in Table 9, the most preferred form of code gloss in both corpora is
‘such as’. A parallel form to ‘such as’ is ‘for example’, which is also a preferred form in the
MU corpus.

The use of the bracket and the dash indicates that MD, particularly in the use of code glosses, is
not realised by words alone. Punctuation marks also carry metadiscoursal meaning, which allows
writers to elaborate further on what they have conveyed in writing. The conjunction ‘or’ is also
used as a code gloss in both the BAWE and MU corpora. Forms such as ‘as a matter of fact’ and
‘put another way’, found in the MU corpus, were not found at all in the BAWE corpus. However, the
dash ‘(-)’, ‘e.g.’, ‘i.e.’, ‘indeed’, ‘namely’, ‘that is to say’ and ‘viz’ were not found in the MU corpus.










Hedges 

Table 10: The first five preferred forms of hedges in BAWE corpus and MU corpus

     BAWE Corpus                                           MU Corpus
No.  Forms of Hedges         Total Hits  Per 10,000 words  Forms of Hedges         Total Hits  Per 10,000 words
1    Would                   1840        22.8              May                     177         12.2
2    May                     1234        15.3              Would                   98          6.7
3    Could                   981         12.1              About                   96          6.6
4    Perhaps                 433         5.4               Might                   72          5.0
5    Seems                   387         4.8               Could                   63          4.3


In examining the use of hedges, the concordancing outcomes revealed that the prevalent form is
the auxiliary verb. The first three preferred forms in the BAWE corpus are ‘would’, ‘may’ and
‘could’. Similarly, the MU corpus exhibited the use of auxiliary verbs: the first, second, fourth
and fifth preferred forms are ‘may’, ‘would’, ‘might’ and ‘could’ respectively, while the third is
the adverbial ‘about’. The fourth preferred form in the BAWE corpus is the adverbial ‘perhaps’
and the fifth is a linking verb, ‘seems’.

Almost all instances of ‘about’ in the MU corpus were found before the presentation of statistical
information. This indicates that the MU writers were careful not to claim absolute knowledge of
the proposition (see the concordance output below).

1. ... 06, the smokers among women increasing [[about]] 0.3 %from the year before and for t ...
2. ... cigarette smoking. Whereas, there is [[about]] 0.9 % of men deaths caused by cigarette ...
3. ... creasing from year 2006 to year 2007 at [[about]] 1.3 % because most of them came from ...
4. ... e age of 18 start smoking and currently [[about]] 1 in 5 teenagers smoke . Many steps h ...

(Essays of MU participants) 
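
A keyword-in-context (KWIC) display of the kind shown above can be approximated in a few lines of Python. The sketch below is not the concordancing software's own procedure; it only mimics the layout of the output, with the search word wrapped in double square brackets and a fixed window of characters on either side. The function name, the window width and the sample text are assumptions made for illustration.

```python
import re


def kwic(text: str, keyword: str, width: int = 40) -> list[str]:
    """Return one keyword-in-context line per whole-word match."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"... {left}[[{match.group(0)}]]{right} ...")
    return lines


# Invented sample text, for illustration only.
sample = ("On average about 50 teenagers start smoking every day. "
          "This essay is about the dangers of smoking.")

lines = kwic(sample, "about", width=20)
for number, line in enumerate(lines, start=1):
    print(f"{number}. {line}")
```

In the invented sample, only the first hit precedes a figure and so behaves like the hedging use described above; the second is an ordinary preposition. A raw hit count therefore still has to be checked against the concordance lines before it is treated as a count of hedges.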

Hedges can be linked to the expression of uncertainty or a lack of commitment to the truth of the
propositions. In choosing to use the auxiliaries, the writers convey the message that they
may not possess absolute knowledge of the subject. In persuasive discourse, this strategy
is necessary to win the readers’ acceptance of the arguments. Although the MU corpus also uses
the linking verb ‘seems’, it is less dominant there than in the BAWE corpus. Like the adverbials,
the linking verb ‘seems’ indicates uncertainty. It allows writers to convey that the proposition
is only ‘somewhat’ and not absolutely the case.

Apart from the above forms, other hedges were also detected, such as ‘suggests’, ‘claim’ and
‘possible’. Their distributions in the two corpora were quite varied. The BAWE corpus had a
total of 343 hits for ‘suggests’, while the MU corpus recorded only three. Similarly, the word
‘possible’ garnered a total of 280 hits in the BAWE corpus, while none was found in the MU
corpus. It is apparent, then, that the use of hedges in the MU corpus is not only less frequent
but also less varied. The analysis points to the need for a greater awareness of the role of
hedges in L2 writing. Successful writers are usually able to hedge more. This point has been
highlighted in Williams’ (2007) writing manual and has also been demonstrated in Intaraprawat
and Steffensen’s (1995) study. Thus, L2 writers would need more training to enhance their use of
this aspect of metadiscourse.

Boosters 

Table 11: The first five preferred forms of boosters in BAWE corpus and MU corpus

     BAWE Corpus                                           MU Corpus
No.  Forms of Boosters       Total Hits  Per 10,000 words  Forms of Boosters       Total Hits  Per 10,000 words
1    Must                    741         9.2               Very                    215         14.8
2    Very                    705         8.7               Must                    181         12.4
3    Indeed                  266         3.3               Always                  115         7.9
4    Never                   264         3.3               Actually                109         7.5
5    Always                  253         3.1               Really                  97          6.7


Three forms appear among the five most preferred in both the BAWE and MU corpora: ‘must’, ‘very’
and ‘always’ (Table 11). These expressions, in the form of modal verbs, modifiers or adverbs, play
a similar role: they accentuate the certainty of the propositions. Similarly, the other forms
found in Table 11, ‘indeed’ and ‘never’ in the BAWE corpus and ‘actually’ and ‘really’ in
the MU corpus, also increase the certainty of the proposition.

As with the other categories of MD discussed earlier, boosters used in the BAWE corpus were more
varied. A total of 38 different linguistic expressions were recorded in the BAWE corpus, while
the MU corpus had only 24 of this category. The BAWE corpus also recorded more forms with over
100 hits (12 in total), while the MU corpus has only four (see Table 12).
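
The selection of forms with more than 100 hits can be reproduced directly from the hit counts reported in Tables 11 and 12. The Python sketch below does so; the dictionary layout and the function name are illustrative assumptions, and the MU dictionary contains only the five forms for which counts are reported.

```python
# Hit counts as reported in Tables 11 and 12.
BAWE_BOOSTERS = {
    "must": 741, "very": 705, "indeed": 266, "never": 264,
    "always": 253, "clearly": 215, "in fact": 209, "actually": 184,
    "certainly": 110, "obvious": 106, "true": 105, "really": 102,
}
MU_BOOSTERS = {
    "very": 215, "must": 181, "always": 115, "actually": 109, "really": 97,
}


def frequent_forms(hits: dict[str, int], threshold: int = 100) -> list[str]:
    """Forms with more than `threshold` hits, most frequent first."""
    selected = [form for form, n in hits.items() if n > threshold]
    return sorted(selected, key=lambda form: hits[form], reverse=True)


bawe_frequent = frequent_forms(BAWE_BOOSTERS)
mu_frequent = frequent_forms(MU_BOOSTERS)

print(len(bawe_frequent), "BAWE forms:", bawe_frequent)
print(len(mu_frequent), "MU forms:", mu_frequent)
print("Shared:", sorted(set(bawe_frequent) & set(mu_frequent)))
```

Because the threshold is applied to raw hits, it favours the larger corpus; the normalised rates in the same tables show that the four MU forms are in fact used more densely than their BAWE counterparts.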


Table 12: Forms of boosters in BAWE corpus and MU corpus with more than 100 hits

     BAWE Corpus                                           MU Corpus
No.  Forms of Boosters       Total Hits  Per 10,000 words  Forms of Boosters       Total Hits  Per 10,000 words
1    Must                    741         9.2               Very                    215         14.8
2    Very                    705         8.7               Must                    181         12.4
3    Indeed                  266         3.3               Always                  115         7.9
4    Never                   264         3.3               Actually                109         7.5
5    Always                  253         3.1
6    Clearly                 215         2.7
7    In fact                 209         2.6
8    Actually                184         2.3
9    Certainly               110         1.4
10   Obvious                 106         1.3
11   True                    105         1.3
12   Really                  102         1.3








Engagement Markers 

Table 13: The first five preferred forms of engagement markers in BAWE corpus and MU corpus

     BAWE Corpus                                                MU Corpus
No.  Forms of Engagement Markers  Total Hits  Per 10,000 words  Forms of Engagement Markers  Total Hits  Per 10,000 words
1    We (inclusive)               1601        19.8              We (inclusive)               974         67.0
2    Our                          568         7.0               Our                          811         55.8
3    Us (inclusive)               476         5.9               You                          383         26.3
4    ?                            441         5.5               ?                            259         17.8
5    You                          243         3.0               Your                         197         13.5


‘Engagement markers’ provide the avenue for writers to build solidarity with their readers. The
inclusive ‘we’ topped the list of the five most preferred forms of engagement markers in both
corpora. Other forms in the above list were the pronoun ‘you’, the possessive forms ‘our’ and
‘your’, and ‘us’, the object form of ‘we’.

It is interesting to note that writers from both corpora also made the effort to engage their
readers by using questions. This is a useful strategy, as posing a rhetorical question draws the
readers to participate actively in the process of the argument. The concordance lines show that
although the MU writers attempted to use questions to engage their readers, there were instances
where the questions were not constructed correctly or the sentence structures were awkward. This
difficulty with standard sentence construction is a common problem among L2 writers. As evolving
writers attempt to write in the target language, there is bound to be some confusion, either
because of first language interference or because their acquisition of the structures is still at
a developmental stage.

Just as there are dominant linguistic expressions of engagement markers in the corpora, there
are also linguistic realisations that have a very low frequency of use. The forms of engagement
markers that account for only between one and ten hits in the BAWE corpus are the following:
‘imagine’, ‘notice’, ‘assume’, ‘look at’, ‘let’s’, ‘observe’, ‘suppose’, ‘by the way’ and
‘think about’. In the MU corpus, the forms that have ten hits or fewer are ‘think about’,
‘let us’, ‘note’, ‘notice’, ‘by the way’ and ‘suppose’. Why this is so merits further
investigation.









Attitude Markers 

Table 14: The first five preferred forms of attitude markers in BAWE corpus and MU corpus

     BAWE Corpus                                              MU Corpus
No.  Forms of Attitude Markers  Total Hits  Per 10,000 words  Forms of Attitude Markers  Total Hits  Per 10,000 words
1    Important                  686         8.5               !                          74          5.1
2    Essential                  137         1.7               Unfortunately              25          1.7
3    Interesting                78          1.0               Important                  19          1.3
4    Essentially                71          0.9               Agree                      12          0.8
5    Dramatic                   55          0.7               Disagree                   6           0.4


In comparing the use of attitude markers, it is observed that they are realised through
adjectives, adverbs, attitude verbs and punctuation (Table 14). In the BAWE corpus, the first five
preferred forms of attitude markers were mostly adjectives (‘important’, ‘essential’,
‘interesting’ and ‘dramatic’), with one adverb (‘essentially’). In the MU corpus, however, there
was more variety. Besides an adjective (‘important’), the writers also used punctuation (‘!’)
as well as an adverb (‘unfortunately’) and attitude verbs (‘agree’ and ‘disagree’). These
different linguistic realisations allow readers to understand not just the propositional content
but also the stance of the writers towards the propositions. Attitude markers are essential in
a persuasive written text, as a text devoid of them would be too arid and impersonal.

Apart from the five most preferred forms, other attitude markers are used in the two corpora. In
the MU corpus, however, their use is not that significant, with a total of fewer than five hits.
The BAWE corpus, on the other hand, exhibited a total of 23 different linguistic realisations.
This difference again speaks of a wider repertoire of linguistic realisations used by the BAWE
writers.


Self Mention 

Table 15: The five preferred forms of self mention in BAWE corpus and the three preferred forms
in MU corpus

     BAWE Corpus                                           MU Corpus
No.  Forms of Self Mention   Total Hits  Per 10,000 words  Forms of Self Mention   Total Hits  Per 10,000 words
1    I                       758         9.4               I                       223         15.3
2    My                      175         2.2               My                      39          2.7
3    Me                      62          0.8               Me                      20          1.4
4    The author              24          0.3
5    The writer              13          0.2


Table 15 shows the first five preferred forms of self mention in the BAWE corpus (out of a total of
nine different forms). In the MU corpus, however, there were only three preferred forms. Linguistic
forms such as ‘mine’, ‘the author’, ‘the author’s’, ‘this author’, ‘the writer’ and ‘the writer’s’ were
uncommon in the MU corpus. Referring to themselves as ‘the writer’ or ‘the author’ may sound too
formal in an expository essay whose topic is persuading young people not to smoke. In contrast,
the academic assignments in the BAWE corpus are formal and longer, covering three broad
disciplines: humanities, social sciences and science (Nesi et al. 2004, p.443). Therefore, the use
of these forms may be deemed necessary.

It was also found that the three most preferred forms were the same for both corpora and occurred
in the same order of frequency (highest to lowest). They are the first person pronoun ‘I’, the
possessive pronoun ‘my’ and the object form ‘me’. All three forms allow writers to intrude into
the text and signal their presence. Of the three, the first person pronoun ‘I’ allows writers to
state their stand in relation to their arguments.

As found in the data display generated, the first person pronoun allows writers to categorically 
state their stand such as ‘I strongly believe’, ‘I strongly discourage’ or ‘I absolutely agree’. In 
addition, the use of the first person pronoun also demonstrates the writers’ personal feelings 
towards the proposition as in the expressions ‘I hope’, ‘I admit’, or ‘I’m sure’. 

In the BAWE corpus, the first person pronoun is used for the same effect. Furthermore, writers
also use the first person pronoun to inform readers of their intention, as in ‘I will look at’,
‘I will then explore’ and ‘I will limit’. The same can be said of the form ‘me’, which, like the
first person pronoun, indicates the writer’s presence. As highlighted by Hyland (2005, p.53), the
signalling of the writer’s presence or absence in the text is a matter of the writer’s conscious
choice. In stating their stance, writers have the liberty to project their presence or distance
themselves from their proposition, depending on how they would like to relate their argument to
their readers or to their community’s expectations.

The possessive pronoun is used to signal that the proposition given is entirely the writer’s and
not anyone else’s. It is expressed in linguistic choices such as ‘in my opinion’, ‘in my essay’,
‘for my decision’ and ‘in my view’.


CONCLUSION 

There is much evidence of the use of MD in both the BAWE and MU corpora, evidence that the
concordancing tool made easy to display. While there were similarities in use, the results showed
major differences in the frequency of occurrence, distribution and variety of forms between the
two corpora. The data support the conclusion that the BAWE corpus exhibited a greater use of MD.
The BAWE corpus also showed more use of interactive MD, while the MU corpus exhibited more
interactional MD. This suggests that the BAWE writers could be more concerned with, and more aware
of, the internal structuring of their argument, while the MU writers were more inclined towards
building a relationship with readers in forwarding their arguments. The MU writers likely need to
improve the internal structuring of their text to show more sophistication in their writing.

Among the different categories of interactive MD, transitions seem to enjoy the highest frequency
of use. This is not surprising, as transitions consist of conjunctions that allow writers to link
their ideas in the text, and they are generally much emphasised in the teaching of writing. The
least frequently used category is frame markers in the BAWE corpus and endophoric markers in the
MU corpus. The likely reason is that the essays in both corpora are not exceptionally long, so the
writers had little occasion to use more frame markers or endophoric markers.





For interactional MD, hedges were more dominant in the BAWE corpus, while engagement markers
were most frequently used in the MU corpus. This indicates that the BAWE writers are more
conscious of the need to hedge, while the MU writers appear to focus more on building solidarity
with their readers to engage them in their arguments. It could also be construed that the more
proficient BAWE writers were more aware of the demands of academic writing, whereby hedging subtly
invites readers to accept the propositions made by the writer. The L2 MU writers, on the other
hand, showed a predominant use of the engagement marker ‘we’ as a solidarity marker. If overused,
this marker could sound overbearing. Again, the relative effects of MD need to be explained, and
greater use of hedges deserves more emphasis in the teaching of L2 writing.

Apart from the frequency of use, the forms of MD in the two corpora are also markedly
different. One common thread that runs through the BAWE corpus is the consistent use of a
greater variety of forms. In addition, the forms in the BAWE corpus had a higher number of
electronic hits. We can therefore conclude that the BAWE writers, as better writers, were more
conscious of the features of formal academic writing and were able to use more appropriate and
varied MD forms to convey their thoughts. It was also generally noted that the number of hits,
when expressed as occurrences per 10,000 words, seems to indicate that many of the MD features are
spaced out. This could lead to the conclusion that MU writers use much shorter sentence
constructions for idea connection. A study of the connection between MD features and sentence
length or idea units is potentially worthwhile.

Although the MU writers had obtained a distinction (A1 and A2) in English in their higher
secondary school leaving certificate, their proficiency in the language still needs to improve.
As developing writers in English, they have to grapple not only with the syntactic and
morphological rules of the language but also with the writing genre conventions of a discourse
community. This difficulty was also voiced by Intaraprawat and Steffensen (1995). A lack of
vocabulary and of knowledge of the syntactic rules of the language hampers the effort of some of
the MU writers to construct meaningful sentences that would aid the readers’ understanding of
the texts. For these reasons, attempts to use metadiscoursal features were sometimes found in
non-standard sentence constructions. From the results obtained, we can therefore conclude that
the MU writers are still at the evolving writer’s stage and have not closely approximated the
writing ability of the BAWE writers.

Writing is also culture bound and may therefore be characterised by idiosyncratic use. The same
can be said of MD as a feature of writing (cf. Crismore et al. 1993; Dafouz-Milne 2008; Aertselaer
2008). However, in the case of undergraduate academic writing, the norm that instructors would
aspire to is one that is generally accepted as good writing regardless of situation and place,
though its criterion features are undoubtedly influenced by L1 writing development from the west.
In understanding emerging L2 writers of English, there is certainly a place for cross-cultural
investigation to establish the ‘norms’ that characterise their writing, and this would certainly
include the use of MD. The explication of these particularised ‘norms’ would serve as a comparison
between cultural perspectives to achieve an understanding of underlying influences in L2 writing,
and could thence move to the next stage of showing, perhaps, a conscious adaptation to the
prevalent expectations of a broader and more influential discourse community.

To conclude, some relevant points can be reiterated about technology and writing. First, the
electronic concordancing tool can be seen as a powerful facilitator in revealing textual differences
between groups. Patterns and frequencies of use of particular features of MD would not have been
so readily accessible without it. The electronic tool enabled the human eye to survey a
comprehensive display of the forms of MD. Such a display would have been too complicated to
manage by hand without technological aid. Using the tool has also given the researchers a means
of analysis that is objective and fast once the parameters are set. The Internet has also helped
pull together resources that are managed online, such as the BAWE corpus, enabling open access
that transcends borders.

In the age of the Internet, electronic texts captured in large corpora are now more amenable to
comparative analysis. In this manner, standards are made available to researchers, in this case
to benchmark developmental writing. Learners could benefit as they can be made more consciously
aware of the importance of writing conventions in the crafting of successful writing. The
teacher, on the other hand, could use the output of the concordance lines as authentic teaching
resources to raise students’ awareness of the construction of linguistic realisations of
appropriate MD, accompanied by appropriate English syntax and morphology. Exposing students to
real-life examples of how more experienced writers manage their writing will serve as an eye
opener to a wide array of rich discourse possibilities and to the cultures of writing. New writing
goals in education could evolve to guide undergraduate writers to greater writing maturity aided
by technological use.


REFERENCES 

Aertselaer, J.N.V. 2008, “Arguing in English and Spanish: A corpus study of stance”, Cambridge 
ESOL: Research Notes , no. 33, pp. 28-33 Retrieved 10 May 2009: 
http://www.cambridgeesol.org/rs_notes/rs_nts33.pdf 

Barlow, M. 2003, Concordancing and corpus analysis using MP 2.2, Athelstan, Houston. 

Crismore, A., Markkanen, R. & Steffensen, M.S. 1993, “Metadiscourse in persuasive writing: A 
study of texts written by American and Finnish university students”, Written 
Communication, vol. 10, no.1, pp. 39-71. 

Dafouz-Milne, E. 2008, “The pragmatic role of textual and interpersonal metadiscourse markers in
the construction and attainment of persuasion: A cross-linguistic study of newspaper
discourse”, Journal of Pragmatics, vol. 40, no. 1, pp. 95-113.

Fromkin, V., Rodman, R. & Hyams, N. 2007, An Introduction to Language, 8th edn. Thomson 
Wadsworth, Boston. 

Harwood, N. 2005, “We do not seem to have a theory ... The theory I present here attempts to fill
this gap: Inclusive and exclusive pronouns in academic writing”, Applied Linguistics, vol.
26, no. 3, pp. 343-375.

Hyland, K. 2001a, “Bringing in the reader: Addressee features in academic articles”, Written
Communication, vol. 18, no. 4, pp. 549-574.

Hyland, K. 2001b, “Humble servants of the discipline? Self-mention in research articles”, English 
for Specific Purposes, vol. 20, no. 3, pp. 207-226. 

Hyland, K. 2004, Disciplinary interactions: Metadiscourse in L2 postgraduate writing. Retrieved 26
June 2007: http://www.sciencedirect.com

Hyland, K. 2005, Metadiscourse, Continuum, New York.

Hyland, K. & Tse, P. 2004, “Metadiscourse in academic writing: A reappraisal”, Applied
Linguistics, vol. 25, no. 2, pp. 156-177.

Intaraprawat, P. & Steffensen, M.S. 1995, “The use of metadiscourse in good and poor ESL essays”,
Journal of Second Language Writing, vol. 4, no. 3, pp. 253-272.

Nesi, H., Sharpling, G. & Ganobcsik-Williams, L. 2004, “Student papers across the
curriculum: Designing and developing a corpus of British student writing”,
Computers and Composition, vol. 21, no. 4, pp. 439-450.

Vande Kopple W. J. 1985, “Some exploratory discourse on metadiscourse”, College Composition 
and Communication, vol. 36, no.1, pp. 82-93. 

Williams, J. 2007, Style: Ten Lessons in Clarity and Grace, 9th edn., Pearson-Longman, New 
York. 

Wu, S. M. 2007, “The use of engagement resources in high and low rated undergraduate
geography essays”, Journal of English for Academic Purposes, vol. 6, no. 3, pp. 254-271.





APPENDIX 


WRITING TASK 
Duration: 1 hour 15 minutes 


Situation: 

You are concerned that despite efforts by the government to discourage young people 
from smoking, the number of smokers particularly among teenagers is on the rise. You 
are also unhappy about how cigarette companies have exploited young children in 
marketing their products. You have read several articles about the problem and extracts 
from three articles are shown below. 


Extract A 


Tobacco’s Influence on Asian Youth 
By: Simone Provence 

http://bang-ishotvou.liveioumal./2003/com/ 

As teenage girls are included in the group of young, and often underage, smokers, many tobacco 
advertisements in Asia are geared towards female’s self-esteem to encourage them to smoke for a 
variety of reasons. In a mall in the Philippines during 1992 there was an advertisement of a 
beautiful, young Asian woman wearing a Marlboro jacket and hat surrounded by giant Marlboro 
signs and posters that caught the eyes of every shopper in the mall[5]. Similar events using highly 
attractive, young Asian women in Cambodia during 2001 and Vietnam in 1996 for cigarette ads 
lead young girls to believe that smoking makes them more attractive, and therefore they pick up 
the deadly, addicting habit[6]. Tobacco, specifically cigarettes, has been used as an appetite 
suppressant for years now, and young women world-wide still think that smoking will help 
prevent them from gaining weight. Movie advertisements in Asia reflect beautiful young women 
smoking cigarettes, which in turn leads to an increase in smoking among female teenagers who 
desire the beauty of the actress they see holding the cigarette. 

Similar to the toy and video game advertisements, and aside from movie ads and rock stars that 
imply to young children and teens that smoking is cool, other children are also partly responsible 
for the increase in popularity of cigarette smoking. Lower class youth view fellow friends and 
classmates from higher income families as role models and look up to them. Those higher class 
youth are the ones who can afford to, and often do, smoke, which makes smoking look cool and 
desirable[10]. 













Extract B 

Tobacco Control 2004; 13:ii37-ii42 
© 2004 BMJ Publishing Group Ltd 

Industry sponsored youth smoking prevention programme in Malaysia: a case study in duplicity

M Assunta, S Chapman 

School of Public Health, University of Sydney, Sydney, NSW, Australia 
"We strongly believe that children should not smoke and that smoking should only be for adults 
who understand the risks associated with it". This statement was made by British American 
Tobacco (BAT) Malaysia in its first social report released in June 2002.1 Tobacco companies in 
Malaysia have been collaborating on youth smoking prevention programmes since 1994.2 The top 
three tobacco transnationals, BAT, Philip Morris (PM), and the former RJ Reynolds (RJR), have 
conducted three joint anti-smoking campaigns directed at youth: "Youth Should Not Smoke" 
(1997); "No Sale to Under 18s" (1998); "On top of the World-Without Smoking" (2001). 

The Control of Tobacco Products Regulation 1993 banned direct tobacco advertisements, the sale 
of cigarettes to under 18 year olds, and prohibited this group from purchasing cigarettes or 
smoking. However, brand stretching activities and sponsorship of sports and entertainment events 
remained legal and extremely widespread.3 On average, every day about 50 teenagers below the 
age of 18 years start smoking in Malaysia and currently about one in five teenagers smoke.4 
Smoking prevalence among teenage boys aged 12-18 years is 30% while smoking among girls has 
doubled from 4.8% in 1996 to 8% in 1999.4 The second national health and morbidity survey in 
1996 reported the public health sector has not acted in a timely manner to curb the marketing 
tactics of the transnational tobacco companies and this "failure to act aggressively from the 1970s 
has made action in the 1990s more difficult".5 Between 1986 and 1996 there was a 67% increase 
in the number of teenage smokers.6 Ninety eight per cent of the tobacco market in Malaysia is 
controlled by transnational tobacco companies.7 

In 1997 the Malaysian government fully endorsed the industry sponsored youth smoking 
prevention (YSP) programme. This endorsement put the tobacco industry in a favourable position 
to influence the government in its tobacco control efforts. While the influence of industry 
sponsored YSP programmes in preventing effective tobacco control legislation has already been 
documented,8-10 this paper provides further insight to a developing country’s situation where, in 
the absence of a government anti-smoking campaign, the industry successfully used the YSP 
programme to counter legislation restricting tobacco marketing and continued to promote tobacco 
to youth. 








Extract C 

Table 1.1 Estimated death caused by cigarette smoking in Malaysia: 2005, 2006, 2007 

Estimated number of deaths caused by cigarette smoking 

Year      Women      Men 
2005      8.5%       21.4% 
2006      8.8%       21.6% 
2007      10.1%      22.3% 


Source: Health Digest, January 2008 


Task: 

As a responsible journalist in a newspaper agency, write a 5 paragraph persuasive essay 
(which includes an introductory paragraph, three developmental paragraphs and a 
concluding paragraph) that discourages smoking among teenagers. Give at least three 
convincing reasons to explain your stand on the issue. To support your stand on the issue, 
you have to do the following: 

i) Quote some of the information in extracts A and B in your essay. 

ii) Insert Table 1.1 of Extract C in your essay and comment on the 
information provided in the table. 

You are allowed to add more points of your own. 

The length of your essay should not be less than 500 words. Before you begin writing, 
take 5-10 minutes to read the question and plan your essay. Then, in about 60 minutes, 
write your 5 paragraph essay. Use the last 5 minutes to revise or edit your essay. 




